STAR-Caps: Capsule Networks with Straight-Through Attentive Routing
Capsule networks have been shown to be powerful models for image classification, thanks to their ability to represent and capture viewpoint variations of an object. However, their high computational complexity, which stems from recurrent dynamic routing, is a major drawback that makes large-scale image classification challenging. In this work, we propose STAR-Caps, a capsule-based network that uses straight-through attentive routing to address this drawback. By using attention modules augmented with differentiable binary routers, the proposed mechanism estimates the routing coefficients between capsules without recurrence, unlike prior related work. The routers then use straight-through estimators to make binary decisions to either connect or disconnect the route between capsules, allowing stable and faster performance. Experiments on several image classification datasets, including MNIST, SmallNorb, CIFAR-10, CIFAR-100, and ImageNet, show that STAR-Caps outperforms the baseline capsule networks.
Reviews: STAR-Caps: Capsule Networks with Straight-Through Attentive Routing
The presented routing mechanism differs from routing by agreement in that routing coefficients are not determined iteratively by evaluating the agreement of votes, but by computing self-attention scores and binary routing decisions for each combination of input and output capsules. The combined routing procedure seems to be novel, while the individual parts are inspired by self-attention in the transformer and by Gumbel-softmax decisions as used in discrete domains like text processing. The paper is technically sound and very well written, and it explains the method and architecture precisely. I feel confident that I could reproduce the method given the provided information. The paper achieves its goal of providing a more efficient substitute for routing by agreement, which allows the architecture to be applied to real-world datasets like ImageNet, as the experiments show.
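To make the description above concrete, here is a minimal sketch of non-iterative routing for a single output capsule: attention scores are computed in one pass, and a binary gate connects or disconnects each input capsule. This is an illustration under stated assumptions, not the authors' implementation. The function names, the softmax over input capsules, and the deterministic 0.5 threshold (in place of the Gumbel-softmax sampling the review mentions) are all assumptions, and the sigmoid-derivative surrogate is only one common form of straight-through estimator.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_gate(logit):
    # Forward pass: hard 0/1 decision (assumed threshold at p > 0.5).
    p = sigmoid(logit)
    hard = 1.0 if p > 0.5 else 0.0
    # Straight-through idea: the hard step has zero gradient almost
    # everywhere, so the backward pass substitutes a surrogate.
    # Here (an assumption) the surrogate is the sigmoid derivative.
    surrogate_grad = p * (1.0 - p)
    return hard, surrogate_grad

def route(votes, attn_logits, gate_logits):
    # votes[i] is the vote vector of input capsule i for one output capsule.
    attn = softmax(attn_logits)                      # one pass, no iteration
    gates = [binary_gate(g)[0] for g in gate_logits]
    coeffs = [a * g for a, g in zip(attn, gates)]    # gated routing coefficients
    dim = len(votes[0])
    out = [sum(c * v[d] for c, v in zip(coeffs, votes)) for d in range(dim)]
    return coeffs, out

# Hypothetical toy inputs: three input capsules, two-dimensional votes.
votes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coeffs, out = route(votes, attn_logits=[0.0, 0.0, 0.0], gate_logits=[2.0, -2.0, 0.5])
print(coeffs, out)
```

In this toy run the second input capsule is disconnected (its gate logit is negative), so its routing coefficient is exactly zero and it contributes nothing to the output capsule. The coefficients come from a single forward computation, which is the contrast with iterative routing by agreement that the review draws.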